DOCUMENT RESUME

ED 134 938                                                  CS 003 193

AUTHOR          Jenkins, Joseph R.; Pany, Darlene
TITLE           Curriculum Biases in Reading Achievement Tests.
                Technical Report No. 16.
INSTITUTION     Bolt, Beranek and Newman, Inc., Cambridge, Mass.;
                Illinois Univ., Urbana. Center for the Study of
                Reading.
SPONS AGENCY    National Inst. of Education (DHEW), Washington,
                D.C.
PUB DATE        Nov 76
CONTRACT        400-76-0116
EDRS PRICE      MF-$0.83 HC-$1.67 Plus Postage.
DESCRIPTORS     Basic Reading; *Beginning Reading; *Curriculum
                Evaluation; Grade 1; Grade 2; Primary Education;
                Program Evaluation; Reading Materials; *Reading
                Research; *Reading Tests; Standardized Tests; *Test
                Bias
IDENTIFIERS     *Center for the Study of Reading (Illinois)


ABSTRACT
The extent and direction of curriculum bias in standardized reading
achievement tests were examined. Bias was estimated by comparing the
relative overlap in the contents of five separate reading achievement
tests with the contents of seven commercial reading series at
first-grade and second-grade levels. Overlap between each achievement
test and each reading series is reported in terms of achievement test
grade equivalent scores that would be expected, given mastery of the
words which appear both as content in a reading series and as
achievement test items. Results indicate clear discrepancies between
the grade equivalent scores obtained, both between tests for a single
curriculum and on a single test for different reading curricula. The
implications of the apparent curriculum bias of achievement tests are
discussed as they relate to teacher, child, and curriculum evaluation,
to reading placement, and to applied educational research. (Author)


***********************************************************************
* Documents acquired by ERIC include many informal unpublished        *
* materials not available from other sources. ERIC makes every effort *
* to obtain the best copy available. Nevertheless, items of marginal  *
* reproducibility are often encountered and this affects the quality  *
* of the microfiche and hardcopy reproductions ERIC makes available   *
* via the ERIC Document Reproduction Service (EDRS). EDRS is not      *
* responsible for the quality of the original document. Reproductions *
* supplied by EDRS are the best that can be made from the original.   *
***********************************************************************


CENTER FOR THE STUDY OF READING*

Technical Report No. 16

CURRICULUM BIASES IN READING ACHIEVEMENT TESTS

Joseph R. Jenkins and Darlene Pany
University of Illinois at Urbana-Champaign

November 1976


University of Illinois
at Urbana-Champaign
1005 West Nevada Street
Urbana, Illinois 61801

Bolt Beranek and Newman Inc.
50 Moulton Street
Cambridge, Massachusetts 02138


The research reported herein was supported in part by the National
Institute of Education under Contract No. US-NIE-C-400-76-0116.

*Designation as a Center pending approval.


Abstract

The extent and direction of curriculum bias in standardized reading
achievement tests are examined. Bias was estimated by comparing the
relative overlap in the contents of five separate reading achievement
tests with the content of seven commercial reading series at first and
second grade levels. Overlap between each achievement test and each
reading series is reported in terms of achievement test grade equivalent
scores that would be expected given mastery of the words which appear both
as content in a reading series and as achievement test items. Results
indicate clear discrepancies between the grade equivalent scores obtained
both between tests for a single curriculum and on a single test for
different reading curricula. The implications of the apparent curriculum
bias of achievement tests are discussed as they relate to teacher, child,
and curriculum evaluation, to reading placement, and to applied
educational research.


CURRICULUM BIASES IN READING ACHIEVEMENT TESTS 1

Joseph R. Jenkins

and

Darlene Pany

University of Illinois at Urbana-Champaign

Information furnished by standardized, norm-referenced achievement
tests influences a broad range of educational decisions. Government
agencies use achievement test results to assess the impact of federally
[illegible] programs. School board budget allocations are modified by
achievement test results. School psychologists rely on achievement test
results to identify children for special [illegible] supportive
services. Educators at a number of levels use them to evaluate
curricula. Researchers use [illegible] innovations. And, of course,
teachers use [illegible].

Test developers have openly encouraged consumer confidence in their
instruments. Their product endorsements specifically detail the variety
of appropriate uses of achievement test results.


According to their developers, achievement tests:

"... (tell) what pupils have learned in school" (Metropolitan
Achievement Test Primary I Teacher's Handbook, 1970, p. 2),

"... provide a basis for reporting pupils' achievement to
parents" as well as permit one "to compare present and
past achievement in order to determine and evaluate the
rate of progress" (Stanford Achievement Test Primary I Battery
Directions for Administering, 1964, p. 30),

"... [serve as] warning signals [to] give the pupil special
help within the framework of the regular instructional program
... request help from various specialists in the school"
(Metropolitan Achievement Test Primary I Teacher's Handbook,
1970, p. 3),

"... [permit] the determination of instructional levels in
school children" and "the assignment of children to
instructional groups ..." (Wide Range Achievement Test
Manual, 1965, p. [illegible]),

"... are a source of information on which to base major
curriculum changes" (Stanford Achievement Test Primary I
Battery Directions for Administering, 1964, p. 30).


In spite of the fact that achievement tests are highly touted by
test developers and publishers, they are not without their critics.
Objections to conventional achievement tests have generally taken one
of two forms. Advocates of criterion-referenced testing argue that
norm-referenced measures tell little about what an individual child
has learned or not learned (Carver, 1972). Instead, norm-referenced
tests can indicate only how a particular child's score compares with
scores obtained by children in the norming sample (Popham, 1974). Others
have criticized achievement tests on the basis of research which indicates
that achievement test performance often fails to correspond with performance
in actual classroom curricula (Brown, Note 1; Glaser, Note 2; McCracken,
1962; Sipay, 1964). Carver has concluded that "... grade level scores on
reading tests have no connection with grade level difficulty of basal
readers or other curriculum materials" (Carver, 1972, p. 300). Eaton
and Lovitt (1972) have furthermore presented data which raise doubts
about the capacity of achievement tests to measure children's annual
academic growth.

Despite growing dissatisfaction with conventional achievement tests in
some circles, the educational community continues to place enormous
confidence in them; when achievement test results run counter to teachers'
perceptions of children's progress, the achievement test score is often
accepted as the more valid assessment. When a child receives a low score
on a test, it is the child, the teacher, and/or the curriculum that is
blamed. Unprepared children, inadequate curricula, and unsystematic
teachers are definitely plausible explanations for poor test performance.
However, there is another explanation that is rarely considered, namely
that achievement tests may not measure what was taught. The present
investigation focuses attention on this latter interpretation and examines
the extent to which reading achievement tests may not adequately sample
particular instructional programs, even if the instructional programs
may themselves be adequate.


Most conventional reading achievement measures are composed of one
or more subtests such as word recognition, vocabulary meaning, and
comprehension. Each of these tests is, in turn, composed of a particular
set of words that the child must be able to read. Test developers assure
the consumer that the test items (words) are a representative sample of
words taught in a wide variety of reading curricula (SORT Directions;
PIAT Manual, 1970; Metropolitan Achievement Test Primary I Teacher's
Handbook, 1970). Despite these assurances, it is entirely possible that
the sample of words appearing on a reading achievement test overlaps
the words taught in one curriculum more than those taught in another.
Reading tests could, in fact, be positively or negatively biased toward
a specific reading curriculum by virtue of the particular sample of test
words. Such biases might be detected by determining the overlap between
various reading curricula and various achievement measures.


The authors recognize that content overlap between reading curricula
and achievement tests is not the only factor which determines how children
in a particular curriculum will perform on an achievement test.
In some instances, children will correctly identify on a test words which
were not directly taught in their reading curriculum; they may have
learned words from sources other than their reading program (e.g.,
television, family members and peers). They also may decode some
unfamiliar test words by applying phonic rules that were taught directly
(synthetic phonics) or indirectly (analytic phonics) (Chall, 1967). On
the other hand, children may fail to read some words on an achievement
test, even though the words were included in their reading program. They
may not have mastered those words in the first place, or at the time of
the test they may have forgotten words that they once knew. In spite of
the fact that performance may reflect factors other than the
curriculum, it seems safe to assume that the content words of a reading
curriculum make the single largest contribution to a child's reading
vocabulary. Factors other than the content words themselves might be
expected to counterbalance one another so that reported grade equivalent
test scores, estimated solely from curriculum content words, should
reasonably indicate both the extent and direction of curriculum bias in
selected reading achievement tests.


Just how accurate are normalized test scores when related to specific
reading curricula? How much weight should be placed on those test results
in terms of evaluating or placing students, or communicating information
about individual or group achievement in a given school year? To address
these questions, the authors assessed the extent and direction of curriculum
bias in five widely employed standardized achievement tests: the Wide Range
Achievement Test (WRAT); the Peabody Individual Achievement Test (PIAT); the
Metropolitan Achievement Test (MAT); the Stanford Achievement Test (SAT);
and the Slosson Oral Reading Test (SORT). Bias was estimated by comparing
the relative overlap in the contents of these different reading achievement
tests with the first and second grade contents of seven commercial reading
series: Economy (Keys to Reading); Ginn (Reading 360); Macmillan (The
Bank Street Readers); Macmillan (Macmillan Reading Program); Houghton-
Mifflin (Reading for Meaning); Science Research Associates (The SRA Reading
Program); and McGraw-Hill (Sullivan Associates Programmed Reading).

Method

First and second grade books from seven basal reading series were
surveyed (see Table 1). Publishers' guidelines were used to determine which
books in a series corresponded to first and second grade content. Teachers'
manuals were used to compile alphabetized word lists for each book in a
series. Unless specifically indicated as "supplementary" (Houghton-
Mifflin), "enrichment" (Ginn), or "sounding vocabulary" (Economy), all words
were assumed to appear in the reading text and to be taught for mastery.

Next, alphabetized lists of all words in seven standardized tests
and subtests of word recognition were prepared. In all but two instances,
reading tests and subtests which involved sentence or paragraph reading
were excluded; the exceptions were the MAT Primary II Word Knowledge
Subtest and the SAT Primary I and Primary II Paragraph Meaning Subtests.
For these tests, a list was made only of those words that were correct
responses.

The extent of overlap between each reading series and each achievement
test could then be assessed by comparing test word lists with
curriculum word lists to determine the total number of word matches per
grade level. For example, of the 50 words taught in Economy, Level 2
(the first of five books read in first grade), three words, "jump," "play,"
and "run," appear on the PIAT Word Recognition Subtest. Thus, Economy,
Level 2, and the PIAT yield three word matches. Only words which appeared
in the same form both on the test and in the curriculum were counted as
matches. Exceptions included words with -s, -d, -ed, and -ing endings
which did not change the root word. The words "walk" and "walking" would
qualify as matches, but the words "ride" and "riding" would not, since
the "e" is dropped in "riding." Similarly, the words "hunger" and "hungry"
would not qualify as matches.
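
The matching rule just described can be sketched in code. This is only an illustration under stated assumptions: the matching in the study was done by hand, and the names `ALLOWED_ENDINGS`, `is_match`, and `count_matches` are hypothetical, introduced here for the example.

```python
# Endings that may be added to a root word without breaking a match,
# as listed in the text: -s, -d, -ed, -ing.
ALLOWED_ENDINGS = ("s", "d", "ed", "ing")


def is_match(test_word: str, curriculum_word: str) -> bool:
    """Return True if the two words count as a match under the stated rule.

    Words match when they are identical, or when one is the other plus an
    allowed ending with the root word left unchanged.
    """
    a, b = test_word.lower(), curriculum_word.lower()
    if a == b:
        return True
    shorter, longer = sorted((a, b), key=len)
    return any(longer == shorter + ending for ending in ALLOWED_ENDINGS)


def count_matches(test_words, curriculum_words):
    """Count test words that match at least one curriculum word."""
    return sum(
        any(is_match(t, c) for c in curriculum_words) for t in test_words
    )
```

Under this sketch, `is_match("walk", "walking")` is true, while `is_match("ride", "riding")` and `is_match("hunger", "hungry")` are false, which agrees with the examples in the text.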

The PIAT, WRAT and SORT all have error ceilings which, if reached,
conclude testing (see Table 2). Thus, it was necessary to consider the
sequence of test words when locating word matches. Some potential word
matches were excluded since the error ceiling would have terminated testing
before the word appeared.
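
The effect of an error ceiling on the count of word matches can be illustrated with a short sketch. The actual ceiling rules are those given in Table 2 and are not modeled here; the rule used below (testing stops after a fixed number of consecutive misses) and the names `raw_score_with_ceiling` and `ceiling` are assumptions made purely for illustration.

```python
def raw_score_with_ceiling(test_sequence, known_words, ceiling):
    """Count known words in test order, stopping at the error ceiling.

    ASSUMPTION: testing ends once `ceiling` consecutive words are missed.
    Actual ceiling rules differ by test and are not modeled here.
    """
    score = 0
    consecutive_errors = 0
    for word in test_sequence:
        if word in known_words:
            score += 1
            consecutive_errors = 0
        else:
            consecutive_errors += 1
            if consecutive_errors >= ceiling:
                break  # words after this point are never presented
    return score
```

The design point is that a curriculum word appearing late in the test sequence contributes nothing if the ceiling is reached before it, which is why the order of test words had to be considered.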


Since word recognition tests are scored by one point for each correct
word, the total number of word matches yielded a raw score. Raw scores
were then converted to grade equivalent scores according to test manual
specifications. For example, comparison of words from the first grade
level of the SRA Basic Reading Series (Books A-D) with the words appearing
on the SORT indicated 20 word matches (raw score = 20), yielding a grade
equivalent score of 1.0, according to the following calculation: 3 (Book A
words appearing on the SORT) + 2 (Book B words) + 8 (Book C words) + 7
(Book D words) = 20 (raw score), converted to grade equivalent = 1.0.
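
The calculation in this example can be written out as a small sketch. The per-book counts come from the text; the conversion table is a hypothetical stand-in for the test manual's table and holds only the single pair reported above (raw score 20, grade equivalent 1.0). The names `matches_per_book`, `SORT_GRADE_EQUIVALENTS`, `raw_score`, and `grade_equivalent` are assumptions for illustration.

```python
# Word matches per book, taken from the SRA/SORT example in the text.
matches_per_book = {"Book A": 3, "Book B": 2, "Book C": 8, "Book D": 7}

# ASSUMPTION: a stand-in for the test manual's conversion table. Only the
# single pair reported in the text (raw score 20 -> 1.0) is included.
SORT_GRADE_EQUIVALENTS = {20: 1.0}


def raw_score(matches):
    """One point per matched word, summed across books."""
    return sum(matches.values())


def grade_equivalent(score, table):
    """Look up a grade equivalent; fail loudly if the table lacks the score."""
    if score not in table:
        raise KeyError(f"no grade equivalent recorded for raw score {score}")
    return table[score]
```

The lookup deliberately raises an error for any raw score other than 20 rather than interpolating, since no other conversion values are given in the text.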

Two people independently matched reading test and curriculum word
lists. A third person compared lists of matching words and reconciled
disagreements. Raw, standardized, and grade equivalent scores were also
computed by two persons independently.
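
The comparison step carried out by the third person can be sketched as a set operation. This is an illustrative assumption about how two independently produced match lists might be compared, not a description of the procedure actually used; `find_disagreements` is a hypothetical name.

```python
def find_disagreements(rater_a, rater_b):
    """Return the agreed matches and the words only one rater listed."""
    a, b = set(rater_a), set(rater_b)
    agreed = a & b
    disputed = a ^ b  # symmetric difference: listed by exactly one rater
    return agreed, disputed
```

Only the words in the second returned set would need to be reviewed and reconciled.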


Results and Discussion

Table 3 summarizes the overlap between each achievement test and each
reading series, for first and second grade levels. The overlap is reported
in terms of achievement test grade equivalent scores that would be achieved,
given the words which appear both as items on an achievement test and as
instructional content in a reading series.


Inspection of Table 3 reveals clear discrepancies between the grade
equivalents obtained both between tests for a single curriculum and on a
single test for different reading curricula. The extent of curriculum
bias is not uniform across all achievement tests. At the first grade level,
the MAT appears to exhibit the least curriculum bias in that scores from
all seven reading curricula fall within a narrow range (0.4 grade
equivalents), compared to a range of 1.2 grade equivalent scores obtained
on the SORT. However, the MAT's consistently low grade equivalent scores
raise doubts as to the test's accuracy in describing actual grade level
achievement. Certain of the reading curricula seem to enjoy consistently
high overlap with all five achievement tests. At the first grade level,
the Economy Series obtains or ties with another curriculum for the highest
grade score on five out of the seven tests and subtests examined in
this study. Likewise, some curricula at the first grade level obtain low
grade equivalents across several tests. Ginn 360 obtains or ties for the
lowest first grade equivalent score on six of seven tests and subtests,
Houghton-Mifflin on two of seven, and SRA on two of seven. Some implications
of the apparent biases between achievement tests and reading curricula
are explored below.

Student Evaluation

Students, teachers, and curricula are all subject to evaluation based
on standardized test scores. For a particular student, the test scores are
often used to measure the amount of growth over some period. Children
making "normal" progress are expected to advance one full grade equivalent
for each year spent in school. Examination of the scores (Table 3) for
any curriculum, however, reveals that the amount of growth varies depending
upon the particular test employed. Hypothetically, a child who learned
the content words in Grade 1 of Houghton-Mifflin by the end of Grade 2
would gain one year and four months according to the PIAT, one year and
two months in Word Knowledge (MAT), [illegible] in Word [illegible],
[illegible] months according to the SORT, seven months in Paragraph
Meaning (SAT), and only four months on the WRAT.
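
Growth stated in years and months follows from the difference between two grade equivalent scores. The sketch below assumes the usual convention that one grade equivalent spans a ten-month school year, so each tenth is one month; the function name `gain_in_years_and_months` is hypothetical, and any scores passed to it in an example are illustrative rather than values from Table 3.

```python
def gain_in_years_and_months(start_ge, end_ge):
    """Express a grade equivalent gain as (years, months).

    ASSUMPTION: one grade equivalent = 10 months, so each tenth is a month.
    Only non-negative gains are handled.
    """
    tenths = round((end_ge - start_ge) * 10)  # round to avoid float drift
    if tenths < 0:
        raise ValueError("end_ge must not be lower than start_ge")
    return divmod(tenths, 10)
```

For instance, a move from 1.0 to 2.4 is a gain of 1.4 grade equivalents, which this helper reports as one year and four months.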

An equally distorted picture is presented for Sullivan curriculum
growth as measured by the five tests. Using the PIAT scores, a child
would be judged as "average" at the end of first grade, but by the end
of second grade would have gained only four months. Although 16 months'
growth is shown for the third grade, it still appears that the child would
enter fourth grade behind grade level. If growth in that same curriculum
is measured with the SORT, only one month's growth is indicated by the
end of Grade 1. That can be contrasted with second and third grades
when 14 and 18 months' progress is possible. If the SORT were substituted
for the PIAT, the child completing three years of the Sullivan curriculum
will begin fourth grade somewhat above grade level. Which test results
should be believed in evaluating the child's progress? It appears that
measured progress may be more reflective of test and curriculum combinations
than of teaching and learning. A second grade teacher using SRA could
produce a child reading at or above grade level by the end of second
grade merely by using the WRAT or PIAT instead of the MAT. If dramatic
"growth" is desired, s/he could use the SORT and obtain 19 months' gain from
the end of first to the end of second grade.

Provision of special education and other supportive services is based
to a significant degree on children's achievement test performance. For a
child to be classified as educable mentally handicapped or learning disabled
in most states, s/he must score below his/her grade expectancy on an
achievement test. Low achievement test performance is also used as
corroborative evidence for emotional disturbance. Federal Title I and
Title VII guidelines include an achievement criterion in identifying
candidates for services. How many times have recommendations for retention
or special class placement been prompted or supported by distorted test
results? Often, decisions made during a staffing about a child's
educational placement are based on normative data, with achievement test
results serving as the primary source of information.

Curriculum Evaluation

In addition to using achievement test results to measure pupil growth,
administrators might use achievement test results as a means of evaluating
a particular curriculum (Stanford Achievement Test Primary I Battery
Directions for Administering, 1964; Metropolitan Achievement Test Primary I
Teacher's Handbook, 1970). Suppose that a school district accepted this
suggestion, and field tested several reading series in different classrooms
for two years before deciding which reading series to adopt. If the school
district used selected students' scores on the PIAT to evaluate the
different reading series, they would probably select the Economy or
Houghton-Mifflin texts. If, instead, they used the WRAT to evaluate growth,
they would probably choose SRA, Economy or Bank Street Readers. It is highly
doubtful that conventional achievement tests can serve as unbiased estimates
of a curriculum's worth, at least at the early grade levels. Perhaps, at a
time when all word recognition skills should have been mastered (e.g.,
Grade 4), then an achievement test would not be seriously biased toward
any particular curriculum, at least by virtue of the vocabulary it
contains. However, other sources of bias exist and will be discussed
later.

Teacher Evaluation

Like children and curricula, teachers may also be subject to evaluations
which employ standardized test results. With the emphasis on
accountability in education, a teacher's ability may be judged by the
number of children in his/her class who are at or near grade level
according to year-end achievement tests. A first grade teacher using
the Sullivan Programmed Readers would be rated highly by PIAT or WRAT
results. That same teacher might appear quite inadequate if the MAT,
SORT, or SAT were used. Similarly, what can be said in defense of the
second grade teacher using Sullivan whose class "gained" only four months
on the PIAT? The widespread reluctance of teachers to be held accountable
for their performance may be justified, especially if their "effects"
are measured by biased instruments. The right combination of curriculum
and achievement test may enhance the "effects" of a poor teacher, whereas
the unfortunate combination of test and curriculum may penalize a good
teacher.

Reading Placement (Diagnosis)

Achievement test grade equivalent scores are useful, according to
some authorities, in placing children within a reading series. According
to the results reported in Table 3, one sees that the accuracy of placement
decisions is greatly affected by the combination of achievement test
and reading curriculum. A student who mastered first grade Ginn vocabulary
would obtain a SORT grade equivalent of [illegible]. The same student, if
he had read in Economy, could expect a score of 2.2. What is the proper
interpretation of these scores? Do they indicate that a student using Ginn
is not really prepared to read second grade material, but that one using
Economy is?


The inadequacy of achievement test results for making placement
decisions can be further illustrated. Suppose that a child is new to
a school as a second grader. In September his new teacher administers
the SORT so that a placement in Macmillan can be made. The child, having
read Books 1-7 of the Sullivan Programmed Readers at his former school,
scores a grade equivalent of [illegible]. Other students in the class (who
finished the Macmillan first grade readers) receive on the average a
grade equivalent of 1.7, close to grade level. The teacher might conclude
that the new child is a nonreader and that s/he will not "fit" with
the rest of his/her second grade. The teacher might request supportive
services for the new student, or even consider a special education
placement. However, if the same child were given a WRAT, a grade equivalent
of 2.0 would indicate that he, too, is reading at grade level, and
is only a little behind his classmates who, given their Macmillan
backgrounds, could be expected to obtain a WRAT score of 2.3. In this case
the teacher would probably conclude the child could safely be placed in
a "middle" reading group, beginning a 2-1 reader.

A teacher relying on grade equivalent scores to make a placement
decision in a particular curriculum may be led to radically different
conclusions depending upon the selection of achievement test and the
child's previous reading curriculum. All a teacher knows after
administering a standardized test is how many words on that particular
test a child knows, and how that score compares to other children in the
class, and to some children on whom the test was normed. What the teacher
does not know is which words a child can read in a particular reading
series. It is that information which is needed to place a child at an
appropriate instructional level in a given curriculum.


Educational Research

Applied researchers in education have understood that in order to
assess the relative effects of any independent variable on student
achievement, all other variables which could conceivably influence student
achievement must be controlled. Many studies, particularly those conducted
in normal school settings, have appeared in which the independent
variable under study (e.g., teacher-pupil ratios; classroom organization;
type of special education services, etc.) is confounded with different
classroom curricula. In some research reports, the authors do not feel
compelled even to mention whether curricula were controlled across
treatments. The assumption that achievement tests were unbiased samples of
commercial curricula is, apparently, responsible for the failure to
control carefully the curriculum used by different treatment groups. The
results of the present investigation would suggest that conclusions drawn
from any study where the dependent variable was student achievement
measured by conventional instruments would be significantly attenuated,
unless the classroom curriculum was carefully controlled across treatment
conditions. Inconsistent findings from study to study, so familiar in
the education literature, may in part be accounted for by uncontrolled
but systematic biases between curricula and achievement tests.

Conclusions

The data from the present investigation strongly suggest that a
basic assumption underlying standardized achievement measures, that they
representatively sample different curricula, cannot reasonably be held;
clear, significant biases exist. The nature of this bias is such that
student achievement in a particular curriculum may in no way be reflected
by achievement test scores. Such biases must be acknowledged and
considered any time that a standardized, norm-referenced achievement test
is used for decision making.


In all likelihood, achievement test bias extends beyond measurement
of simple word recognition skill. Reading comprehension tests also
require children to read a specific set of words and respond to them
in some fashion (e.g., recall particular facts, draw an inference, or supply
missing words). If words that compose test items on reading comprehension
tests are more consistent with one reading curriculum than with another,
then children's performance on such test(s) may also be affected. To
compound matters, reading comprehension tests include additional sources
of potential bias such as question format (e.g., cloze vs. multiple choice),
sentence construction (e.g., [illegible]), and topic (e.g., baseball
vs. sewing) over and above those sources of bias found in word
recognition measures. Thus, the problem of achievement test bias does
not conveniently disappear when reading comprehension tests are substituted
for word recognition tests; instead, the problem grows.

What educators need is an instrument to measure learning that is
sensitive to curricular differences. Some form of criterion-referenced
or curriculum-based assessment may provide the solution. Frequent and
direct measures of a child's performance in a specific curriculum should
reveal what skills within the curricula have or have not been mastered,
as well as provide some index of progress which would be sensitive to
what was being taught.


1 [illegible] technical assistance and to Barbara Wilcox and Judith Arter
for their comments on an earlier version of the manuscript.


Reference Notes

1. Brown, S. A comparison of five widely used standardized reading
   tests and an informal reading inventory for a selected group
   of elementary school children. Unpublished doctoral dissertation,
   University of Georgia, 1963.

2. Glaser, N. A comparison of specific reading skills of advanced and
   retarded readers of fifth grade reading achievement. Unpublished
   doctoral dissertation, University of Oregon, 1964.


[Table fragment: achievement tests surveyed; remaining entries illegible]

Slosson Oral Reading Test. R. Slosson. East Aurora, NY: Slosson
Educational Publications, 1963.

Stanford Achievement Test, Primary I and II Battery. T. Kelly, R. Madden,
E. Gardner & H. Rudman. New York: Harcourt, Brace, and World, 1964.

Wide Range Achievement Test. J. Jastak, S. Bijou & S. Jastak. Wilmington,
DE: Guidance Associates, 1965.


References 


Black, I. S. The Bank Street Reading Series, New York: bincnd Lied, 1965. 


Buchanan, C. D. Sullivan Associates Programmed Reading (Series [...],
revised edition). New York: McGraw-Hill, 1968.


Carver, R. Reading tests in 1970 vs. 1980: Psychometric vs. edumetric.
Reading Teacher, 1972, 26, 299-302.


Chall, J. Learning to read: The great debate. New York: McGraw-Hill,
1967.


Clymer, T. Reading 360. Boston: Ginn and Co., 1969.


Dunn, L., & Markwardt, F. Peabody Individual Achievement Test Manual.
Circle Pines, MN: American Guidance Service, 1970.


Durost, W., Bixler, H., Wrightstone, J. W., Prescott, G., & Balow, I.
Metropolitan Achievement Test Primary I Teacher's Handbook. New York:
Harcourt, Brace, Jovanovich, 1970.


Eaton, M., & Lovitt, R. Achievement tests vs. direct and daily measurement.
In G. Semb (Ed.), Behavior analysis and education. University of Kansas,
1972.


Farr, R. Reading: What can be measured? Newark, DE: International
Reading Association, 1969.


Harris, A., & Clark, M. The Macmillan Reading Program. New York:
Macmillan, 1970.


Harris, T., & Creekmore, M. Keys to reading. Oklahoma City: Economy,
1972.


Jastak, J. F., & Jastak, S. R. The Wide Range Achievement Test Manual.
Wilmington, DE: Guidance Associates, 1965.


Kelly, T., Madden, R., Gardner, E., & Rudman, H. Stanford Achievement
Test Primary I Battery Directions for Administering. New York:
Harcourt, Brace, and World, 1964.


McCracken, R. A. Standardized reading tests and informal reading
inventories. Education, 1962, 82, 366-69.


McKee, P., Harrison, M. L., McCowen, A., Lehr, E., & Dunn, W. Reading
for meaning (fourth edition). New York: Houghton-Mifflin, 1966.


Popham, W. J. An approaching peril: Cloud-referenced tests. Phi Delta
Kappan, 1974, 55, [...].


Rasmussen, D., & Goldberg, L. The SRA Reading Program: Basic Reading
Series. Chicago: Science Research Associates, 1971.


Sipay, E. A comparison of standardized reading scores and functional
reading levels. The Reading Teacher, 1964, 17, 265-68.




Table 1

Reading Curricula Grade Levels

Series (Publisher)                     Level                                     Grade

The Bank Street Readers                Preprimers (2)                            1
(Macmillan 1965)                       Primer (Around the City)                  1
                                       1-1 (Uptown, Downtown)                    1
                                       2-1 (My City)                             2
                                       2-2 (Green Light Go)                      2

Keys to Reading                        2 (Pug)                                   [?]
(Economy 1972)                         3 (Sun Tree)                              [?]
                                       4 (Zip! Pop! Go!)                         [?]
                                       5 (Green Feet)                            [?]
                                       6 (Blue Dilly Dilly)                      [?]
                                       7 (Curbstone Dragons)                     [?]
                                       8 (Mustard Seed Magic)                    [?]

Reading 360                            2 (My Sound and Word Book)                [?]
(Ginn and Co., 1969)                   3 (A Duck is a Duck)                      [?]
                                       4 (Helicopters and Gingerbread Men)       [?]
                                       5 (May I Come In?)                        [?]
                                       6 (Seven is Magic)                        [?]
                                       7 (The Dog Next Door and Other Stories)   [?]

Reading for Meaning                    Preprimers (3)                            1
(Houghton-Mifflin 1966)                Primer (Jack and Janet)                   1
                                       1-1 (Up and Away)                         1
                                       2-1 (Come Along)                          2
                                       2-2 (On We Go)                            2

Macmillan Reading Program,             Preprimers (3)                            1
Primary Grades                         Primer (Worlds of Wonder)                 1
(Macmillan 1970)                       1-1 (Lands of Pleasure)                   1
                                       2-1 (Enchanted Gates)                     2
                                       2-2 (Shining Bridges)                     2

The SRA Reading Program                A (A Pig Can Jig)                         [?]
(Science Research Associates 1971)     B (A Hen in a Fox's Den)                  [?]
                                       C (Six Ducks in a Pond)                   [?]
                                       D (A King on a Swing)                     [?]
                                       E (Kittens and Children)                  [?]
                                       F (The Purple Turtle)                     [?]
                                       G (Tony's Adventure)                      [?]

Sullivan Associates Programmed         Primer                                    [?]
Reading                                Books 1 through Book 7                    1
(McGraw-Hill 1968)                     Books 8 through Book 14                   2
                                       Books 15 through Book 21                  3


Table 2

Scoring Criteria


Wide Range Achievement Test (WRAT)

Error ceiling:           12 consecutive errors in word recognition
Raw score:               number of correct words plus 25
Assumptions:             child can identify 13 letters of the alphabet,
                         match ten identical letters, and can identify
                         two letters in his/her name (25 points)


Peabody Individual Achievement Test (PIAT)

Error ceiling:           five errors in seven consecutive words
Raw score:               error ceiling word number minus total number
                         of errors
Assumptions:             a child can identify letter names and can match
                         identical letters, words, and pictures (18 points);
                         starting point (basal level) is the first word
                         (item number 19)


Metropolitan Achievement Test, Primary I and Primary II (Form F) (MAT) and
Stanford Achievement Test, Primary I and Primary II (Form W) (SAT)*

Error ceiling:           none
Raw score:               number correct
Standardized score:      conversion table provided in test manual
Grade equivalent score:  conversion table from standardized score provided
                         in test manual

* Primary I was used to calculate grade one scores.
Primary II was used to calculate grade two scores. First as well as
second grade words were matched to the Primary II test words.
The MAT Primary II Word Knowledge subtest and SAT Primary I and Primary II
Paragraph Meaning subtests are multiple choice tests which involve reading
a sentence or paragraph. Since a curriculum actually may not include the
words that must be read to select the correct word answer, scores on those
tests appearing in Table 3 may be inflated.


Slosson Oral Reading Test (SORT)

Error ceiling:           100% incorrect words in a column of 20 words
Raw score:               number correct
Grade equivalent:        raw score divided by two (table provided in
                         instructions)
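
The raw-score rules in Table 2 can be sketched in a few lines of Python. This is only an illustrative reading of the table, not the authors' procedure: the function names are hypothetical, each test is represented as a list of booleans in item order (True when the test word also appears in the curriculum and is therefore assumed mastered), and the PIAT ceiling item is taken to be the item on which the fifth error within seven consecutive words occurs. Conversions from raw scores to grade equivalents are left to the test manuals.

```python
from collections import deque
from typing import Sequence


def wrat_raw_score(correct: Sequence[bool], ceiling_run: int = 12, bonus: int = 25) -> int:
    """WRAT: correct words up to 12 consecutive errors, plus 25 assumed points."""
    n_correct = 0
    run = 0  # current streak of consecutive errors
    for ok in correct:
        if ok:
            n_correct += 1
            run = 0
        else:
            run += 1
            if run == ceiling_run:
                break  # error ceiling reached, testing stops
    return n_correct + bonus


def piat_raw_score(correct: Sequence[bool], first_item: int = 19,
                   window: int = 7, max_errors: int = 5) -> int:
    """PIAT: ceiling item number minus total errors, starting at item 19."""
    errors = 0
    recent = deque(maxlen=window)  # outcomes of the last seven words
    last_item = first_item - 1     # 18 points assumed below the first word
    for offset, ok in enumerate(correct):
        last_item = first_item + offset
        recent.append(ok)
        if not ok:
            errors += 1
            if recent.count(False) >= max_errors:
                break  # five errors in seven consecutive words
    return last_item - errors


def sort_raw_score(correct: Sequence[bool], column_size: int = 20) -> int:
    """SORT: number correct, stopping after a full column of 20 is missed."""
    n_correct = 0
    for start in range(0, len(correct), column_size):
        column = correct[start:start + column_size]
        hits = sum(column)
        n_correct += hits
        if len(column) == column_size and hits == 0:
            break  # 100% incorrect words in a column
    return n_correct
```

For example, under these assumptions a curriculum that contains the first ten WRAT words but none of the next twelve would earn `wrat_raw_score([True] * 10 + [False] * 12)`, that is, 10 correct words plus the 25 assumed points.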


Table 3

Grade Equivalent Scores Determined by Matching [...]
Reading Text Words to Standardized Reading Test Words

Curriculum            PIAT     MAT        MAT        SORT     SAT       SAT        WRAT
                               Word       Word                Word      Paragraph
                               Knowledge  Analysis            Reading   Meaning

Bank Street Readers
  Grade 1             [?]      [?]        [?]        [?]      [?]       1.6        2.0
  Grade 2             2.8      2.5        [?]        2.4      -         2.8        [?]

Keys to Reading*
  Grade 1             2.0      [?]        [?]        [?]      [?]       1.8        2.2
                      (3.3)    (1.9)      (1.8)      (2.7)    [?]       (2.3)      [?]
  Grade 2             3.3      1.9        <1.0       3.0      -         2.6        3.0
                      (3.8)    (3.7)      (2.0)      (3.5)    -         (3.2)      (3.6)

Ginn 360**
  Grade 1             1.5      <1.0       <1.0       1.4      <1.0      [?]        [?]
                      (1.5)    (<1.0)     (1.0)      (1.5)    (<1.0)    (1.5)      (1.8)
  Grade 2             [?]      [?]        [?]        2.7      -         1.9        2.3
                      (2.8)    (2.5)      (1.1)      (2.7)    -         (1.9)      (2.5)

Houghton-Mifflin***
  Grade 1             2.0      1.1        <1.0       1.6      <1.0      [?]        [?]
                      (3.1)    (1.9)      (1.7)      (2.9)    (1.7)     (2.4)      (3.1)
  Grade 2             3.4      2.3        <1.0       2.4      -         2.3        2.4

Macmillan
  Grade 1             1.8      [?]        [?]        1.9      1.1       1.6        2.3
  Grade 2             2.2      2.5        1.1        2.9      -         2.5        2.6

SRA
  Grade 1             1.5      1.2        1.3        1.0      [?]       1.6        [?]
  Grade 2             [?]      2.5        1.4        2.9      -         2.9        3.5

Sullivan
  Grade 1             1.8      [?]        1.2        [?]      [?]       1.6        2.0
  Grade 2             2.2      [?]        [?]        2.5      -         [?]        2.5
  Grade 3             4.8      -          -          4.3      -         -          3.5

*   Scores in parentheses reflect the inclusion of words listed as "sounding vocabulary."
**  Scores in parentheses reflect the inclusion of words listed as "enrichment."
*** Scores in parentheses reflect the inclusion of words listed as "supplementary."
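
One way to put a number on the between-test discrepancy for a single curriculum is the spread (highest minus lowest) of its grade equivalent scores. The sketch below is an illustration only, not an analysis from the report: the function name is hypothetical, and it assumes the Macmillan Grade 2 row of Table 3 reads 2.2, 2.5, 1.1, 2.9, 2.5, and 2.6, with one subtest not reported.

```python
from typing import Optional, Sequence


def grade_equivalent_spread(scores: Sequence[Optional[float]]) -> float:
    """Return max minus min of the reported scores, rounded to one decimal."""
    reported = [s for s in scores if s is not None]  # skip unreported subtests
    return round(max(reported) - min(reported), 1)


# Assumed reading of the Macmillan Grade 2 row; None marks the unreported subtest.
macmillan_grade_2 = [2.2, 2.5, 1.1, 2.9, None, 2.5, 2.6]
print(grade_equivalent_spread(macmillan_grade_2))  # prints 1.8
```

Under that reading, the same second-grade vocabulary would place a child anywhere from 1.1 to 2.9 depending on the test chosen, a difference of 1.8 grade levels.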


